Word Translation Prediction for Morphologically Rich Languages with Bilingual Neural Networks
Abstract
Our approach enables:
• accurate prediction of the target translation stem and suffix given a fixed amount of context
• automatic learning of relevant features with a neural network architecture
Choosing the correct surface form requires linguistic features of the source and target context:
• in phrase-based SMT, access to source context depends on phrase segmentation
• linguistic features depend on available annotation tools and manual feature engineering
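As a rough illustration of what "a fixed amount of context" could look like in practice, the sketch below builds a fixed-length input from a source window and a short target history. Everything here is an assumption made for illustration: the function name `fixed_context`, the `<pad>` token, the window layout, and the use of an aligned source index are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

PAD = "<pad>"  # hypothetical padding token, not from the paper


def fixed_context(
    source: Sequence[str],
    aligned_index: int,
    radius: int,
    target_history: Sequence[str],
    history_size: int,
) -> List[str]:
    """Build a fixed-length context (illustrative assumption only).

    The result is a source window of 2 * radius + 1 tokens centred on the
    aligned source word, followed by the last `history_size` target words.
    Positions that fall outside the sentence are filled with PAD.
    """
    window = []
    for i in range(aligned_index - radius, aligned_index + radius + 1):
        window.append(source[i] if 0 <= i < len(source) else PAD)

    # Guard against history_size == 0, where a [-0:] slice would return everything.
    recent = list(target_history[-history_size:]) if history_size > 0 else []
    history = [PAD] * (history_size - len(recent)) + recent
    return window + history


if __name__ == "__main__":
    print(fixed_context(["the", "old", "houses"], 2, 1, ["die"], 2))
```

Because the output length is always `2 * radius + 1 + history_size`, it does not depend on how a phrase-based decoder happened to segment the sentence, which is the kind of input a feed-forward network would need.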
Related resources
Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Dealing with the complex word forms in morphologically rich languages is an open problem in language processing, and is particularly important in translation. In contrast to most modern neural systems of translation, which discard the identity for rare words, in this paper we propose several architectures for learning word representations from character and morpheme level word decompositions. ...
A Distributed Inflection Model for Translating into Morphologically Rich Languages
Lexical sparsity is a major challenge for machine translation into morphologically rich languages. We address this problem by modeling sequences of fine-grained morphological tags in a bilingual context. To overcome the issue of ambiguous word analyses, we introduce soft tags, which are under-specified representations retaining all possible morphological attributes of a word. In order to learn ...
Learning Bilingual Phrase Representations with Recurrent Neural Networks
We introduce a novel method for bilingual phrase representation with Recurrent Neural Networks (RNNs), which transforms a sequence of word feature vectors into a fixed-length phrase vector across two languages. Our method measures the difference between the vectors of source- and target-side phrases, and can be used to predict the semantic equivalence of source and target word sequences in the ph...
Providing Morphological Information for SMT Using Neural Networks
Treating morphologically complex words (MCWs) as atomic units in translation would not yield a desirable result. Such words are complicated constituents with meaningful subunits. A complex word in a morphologically rich language (MRL) could be associated with a number of words or even a full sentence in a simpler language, which means the surface form of complex words should be accompanied with...
Induction of Fine-Grained Part-of-Speech Taggers via Classifier Combination and Crosslingual Projection
This paper presents an original approach to part-of-speech tagging of fine-grained features (such as case, aspect, and adjective person/number) in languages such as English where these properties are generally not morphologically marked. The goals of such rich lexical tagging in English are to provide additional features for word alignment models in bilingual corpora (for statistical machine tr...
Publication date: 2014